Protocol to analyze and validate transcriptomic changes in PDGFRβ-KO mesenchymal stem cell osteogenic potential in the mouse embryo

SUMMARY
Mesenchymal stem/stromal cells (MSCs) can differentiate into osteoblasts under appropriate conditions. PDGFRβ signaling controls MSC osteogenic potential both transcriptomically and in culture. Here, we present a “computer to the bench” protocol to analyze changes in MSC osteogenic potential at transcriptomic and cellular level in the absence of PDGFRβ. We detail the preparation of cells from mouse embryos, the analysis of transcriptomic changes from single-cell RNA-sequencing data, the procedure for MSC derivation and culture, and an osteogenic assay for functional validation. For complete details on the use and execution of this protocol, please refer to Sá da Bandeira et al. (2022).¹



BEFORE YOU BEGIN
In mice, the first adult-type hematopoietic stem cells (HSCs) arise from hemogenic endothelial cells lining the dorsal aorta in the aorta-gonad-mesonephros (AGM) region at embryonic day (E) 10.5.² Endothelial cells are surrounded by mesenchymal perivascular cells expressing PDGFRβ that were recently shown to support hematopoietic stem and progenitor cell (HSPC) development.¹ Indeed, the AGM HSPC number was significantly reduced in the absence of PDGFRβ. Moreover, upon culture, we found that mesenchymal stem/stromal cells (MSCs) derived from PDGFRβ-KO AGMs showed reduced hematopoietic support in co-culture experiments and had a low ability to differentiate towards bone.
The following protocol describes detailed steps to analyze MSCs from unfractionated E11 AGMs in vivo by single-cell RNA sequencing and proposes a novel method to validate transcriptomic changes in PDGFRβ-KO MSCs in single AGM-derived cell culture. Here, we modified a culture method described previously.³,⁴ We derived MSCs from single AGMs to study PDGFRβ-WT (+/+) and PDGFRβ-KO (−/−) embryos individually, and we investigated their bone-forming potential at a larger scale.
a. Prepare a 2.5% w/v collagenase type 1 stock solution by adding 1 g of collagenase type 1 to 40 mL of sterile PBS and filter using a large syringe with a 0.45 μm filter.
Note: Aliquots of the collagenase type 1 stock solution can be stored long-term in Eppendorf tubes at −20°C and thawed at room temperature (20°C–22°C), or for a few minutes in a water bath at 37°C, on the day of the experiment.
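The stock concentration in step a can be checked with a short calculation. The following R sketch is illustrative only: the helper names are ours, and the 10 mL example volume is a hypothetical value, not part of the protocol.

```r
# Weight/volume percentage: grams of solute per 100 mL of solution.
percent_wv <- function(mass_g, volume_ml) {
  100 * mass_g / volume_ml
}

# Values from step a: 1 g of collagenase type 1 in 40 mL of PBS.
stock_percent <- percent_wv(mass_g = 1, volume_ml = 40)
stock_percent  # 2.5

# Mass required for another stock volume at the same concentration
# (10 mL is a hypothetical example volume).
mass_for_volume <- function(percent, volume_ml) {
  percent * volume_ml / 100
}
mass_for_volume(percent = 2.5, volume_ml = 10)  # 0.25
```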
b. Cull pregnant dams and harvest the embryos on ice, in cold filtered PBS buffer solution supplemented with 10% fetal calf serum (FCS) and 1% penicillin/streptomycin (PS), henceforth referred to as PBS/FCS/PS.
c. Dissect the embryos immediately afterwards in a Petri dish with PBS/FCS/PS under a stereoscope using a pair of scissors and tweezers, as previously described.¹⁴
Note: The time needed for dissection varies between scientists, so all embryos should be kept on ice prior to dissection.
d. Ensure that the embryos used are at the desired stage of development by counting the somite pairs.
e. Transfer a piece of tissue from each embryo (usually the yolk sac or the head) to an empty, labeled 1.5 mL Eppendorf tube and proceed with genotyping immediately (protocol described in steps 3 and 4 below).
Note: These tubes can be frozen at −20°C if not used on the same day. If cells are used for scRNA-seq, genotyping should be done immediately, in parallel with tissue dissection; this may therefore require two scientists. If AGM cells are used to derive MSCs, genotyping can be done after the cells are seeded, on the same day or later.
Note: Reagents such as the Coral Load Buffer, MgCl₂ solution and DNA polymerase are included in the Hot Start Taq Plus kit. All other reagents should be ordered separately.
b. Dispense 18 μL of the mix into a 0.2 mL PCR tube and add 2 μL of the genomic DNA sample for a total volume of 20 μL, then spin down briefly.
c. Place the PCR reaction tubes in a thermocycler and run the following program (approx. 120 min): 1 cycle at 95°C for 5 min for initial DNA denaturation; 35 cycles of denaturation at 95°C for 1 min, primer annealing at 58°C for 1 min and extension at 72°C for 1 min; 1 cycle at 72°C for 10 min for the final extension; then hold at 10°C until stopped.
d. Prepare a 1.5% agarose gel in 1× TAE buffer with SYBR Safe DNA gel stain at 1:10 dilution.
e. Let the gel set in the chemical hood at room temperature (20°C–22°C) for at least 20 min.
f. Place the set gel in an electrophoresis tank previously filled with fresh 1× TAE, and load a DNA molecular weight marker (20 μL of EasyLadder I).
g. Load the samples and run the gel at 115 V and 400 mA for 90 min.
h. Analyze the gel with a UV transilluminator and read the bands: WT at 114 bp and KO at 320 bp (Figure 1).
Note: The details described above are optimized for this PDGFRβ mouse strain and may differ for other murine strains.
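The stated run time of approx. 120 min follows directly from the program in step c. A minimal R sketch of the arithmetic is given below; it counts only the programmed hold times and ignores ramping between temperatures, which depends on the thermocycler.

```r
# Thermocycler program from step c, in minutes per segment.
initial_denaturation <- 5
cycle_steps <- c(denaturation = 1, annealing = 1, extension = 1)
n_cycles <- 35
final_extension <- 10

# Total programmed time, excluding temperature ramping.
total_min <- initial_denaturation + n_cycles * sum(cycle_steps) + final_extension
total_min  # 120
```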

Single-cell RNA-sequencing data analysis
Timing: days to weeks
This major step describes an end-to-end analysis of scRNA-seq data, to understand the changes occurring in mesenchymal stem/stromal cells (MSCs) in the PDGFRβ-KO AGM at single-cell resolution. The analysis steps broadly correspond to the OSCA Bioconductor workflow, based on the scran¹⁰ and scater⁸ Bioconductor packages¹⁵; similar steps may be carried out via the Seurat R package.¹⁶ Code examples generally refer to a single sample (which we refer to as 'sce_wt'); the analysis steps are almost identical for both WT and KO samples.
5. Library preparation for single-cell analysis.
a. Prepare dissociated AGMs as a single-cell suspension, as mentioned above, for sample loading and library preparation as described in the 10x Genomics protocol "Chromium Next GEM Single Cell 3′ Reagent Kits User Guide (v3.1 Chemistry)".
b. Load 7 × 10³ cells in the Chromium Next GEM Chip G. Count viable cells in a 1:1 ratio with trypan blue (cell size range set to 8–11 μm) to calculate the volume of cell suspension required.
Note: A table is provided in the 10x Genomics protocol to cross-reference the cell stock concentration (cells/μL) with the desired targeted cell recovery (for example, 7 × 10³ cells).
c. Obtain single-cell libraries according to the manufacturer's protocol, which consists of gel-emulsion droplet (GEM) generation and barcoding, followed by cDNA amplification and quantification, as detailed in the "Chromium Next GEM Single Cell 3′ Reagent Kits User Guide (v3.1 Chemistry)".
d. Quantify RNA concentration and the quality of the libraries.
e. Send your libraries for sequencing.
6. Generating single-cell count matrices.
a. Transfer your sequencing data from the sequencing facility to the computer where the data will be processed.
Note: The exact steps for data transfer will vary, depending on your sequencing provider and your local compute infrastructure.
b. At the command line, use 10x Genomics Cell Ranger to generate single-cell feature counts for each sample separately, running 'cellranger count' with the reference dataset previously downloaded.
Note: Here, we used 10x Genomics reference mm10/GRCm38-3.0.0. The 'id' parameter can be specified to label the sample (e.g., WT or KO).
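As an illustration of step 6b, the sketch below assembles one 'cellranger count' call per sample from R. The directory names, FASTQ prefixes and reference folder name are hypothetical placeholders and must be replaced with the paths on your own system; only the commonly used --id, --transcriptome, --fastqs and --sample options are shown.

```r
# Build the argument vector for one `cellranger count` run.
# All paths and sample prefixes used below are hypothetical placeholders.
cellranger_count_args <- function(id, fastq_dir, sample_prefix, transcriptome_dir) {
  c(
    "count",
    paste0("--id=", id),
    paste0("--transcriptome=", transcriptome_dir),
    paste0("--fastqs=", fastq_dir),
    paste0("--sample=", sample_prefix)
  )
}

samples <- c("WT", "KO")
runs <- lapply(samples, function(s) {
  cellranger_count_args(
    id = s,
    fastq_dir = file.path("fastqs", s),
    sample_prefix = s,
    transcriptome_dir = "refdata-cellranger-mm10-3.0.0"
  )
})
names(runs) <- samples

# Print the commands for inspection before running them.
for (s in samples) cat("cellranger", runs[[s]], "\n")

# To launch a run from R (requires Cell Ranger on the PATH):
# system2("cellranger", runs[["WT"]])
```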
7. Import and filter single-cell data.
a. Download the 'raw_feature_bc_matrix' directory output from cellranger count for each sample; this will contain three files: barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz.
b. In R, use read10xCounts() from the DropletUtils Bioconductor package⁷ to import the Cell Ranger output into R in the SingleCellExperiment format.
c. Generate barcode rank plots (Figure 2) to monitor the distribution of barcode counts, then run emptyDrops() to identify empty droplets. Remove barcodes predicted to contain only ambient RNA, using the default false discovery rate (FDR) of 0.1%.

Note: As of v3, Cell Ranger implements a version of the EmptyDrops algorithm that provides similarly filtered barcode matrices, in the 'filtered_feature_bc_matrix' directory.
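Step 7 can be sketched in R as follows. This is a minimal outline rather than the original analysis code: the function names, the seed value and the plot styling are our own choices, while read10xCounts(), barcodeRanks() and emptyDrops() are the DropletUtils functions named in the text.

```r
# Import a raw Cell Ranger matrix and remove empty droplets.
# `raw_dir` is the path to a 'raw_feature_bc_matrix' directory.
filter_empty_droplets <- function(raw_dir, fdr = 0.001, seed = 100) {
  sce <- DropletUtils::read10xCounts(raw_dir, col.names = TRUE)
  m <- SingleCellExperiment::counts(sce)

  # Per-barcode rank statistics for the barcode rank plot.
  ranks <- DropletUtils::barcodeRanks(m)

  # emptyDrops() relies on Monte Carlo simulation, so fix the seed.
  set.seed(seed)
  e_out <- DropletUtils::emptyDrops(m)

  # FDR is NA for barcodes below the ambient threshold; which() drops those.
  keep <- which(e_out$FDR <= fdr)
  list(sce = sce[, keep], ranks = ranks, stats = e_out)
}

# Barcode rank plot with the knee and inflection points marked.
plot_barcode_ranks <- function(ranks) {
  u <- !duplicated(ranks$rank)
  plot(ranks$rank[u], ranks$total[u], log = "xy",
       xlab = "Rank", ylab = "Total UMI count")
  abline(h = S4Vectors::metadata(ranks)$knee, lty = 2, col = "dodgerblue")
  abline(h = S4Vectors::metadata(ranks)$inflection, lty = 2, col = "forestgreen")
}
```

A typical call would be `res <- filter_empty_droplets("WT/outs/raw_feature_bc_matrix")` followed by `plot_barcode_ranks(res$ranks)`, where the path is a placeholder for your own Cell Ranger output.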
8. Annotate and perform quality control on single-cell data.
a. Identify mitochondrial genes by annotating each sample using Bioconductor's AnnotationHub service with an appropriate reference.
Note: Here, we used Ensembl mm38.93 for annotation, to match that used in the creation of the 10x Genomics reference.
Note: We specify 'min.mean = 0.1' in the call to computeSumFactors(), to define the minimum (library size-adjusted) average count of genes to be used for normalization. Setting this parameter avoids using very low-abundance genes for the sum factor computation: if too many genes have consistently low counts across all cells, the computed size factors may be close to zero.
c. Run the normalization using the computed size factors with logNormCounts().
d. Model the variance of the log-expression profile for each gene using scran's modelGeneVarByPoisson() function. This function decomposes log-expression into technical and biological components based on a mean-variance trend corresponding to Poisson noise and utilizes the size factors computed earlier.
e. Finally, define highly variable genes (HVGs) for each sample using getTopHVGs().
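The normalization and feature selection steps above can be outlined as a single helper. This is a sketch under assumptions: the pre-clustering with quickCluster(), the seed and the number of HVGs (n_hvgs = 2000) are illustrative choices that the text does not specify, whereas min.mean = 0.1 is the value given in the Note.

```r
# Normalize one sample and select highly variable genes.
# `sce` is a quality-controlled SingleCellExperiment with a 'counts' assay.
normalize_and_select_hvgs <- function(sce, n_hvgs = 2000, seed = 100) {
  set.seed(seed)

  # Pre-cluster cells so that size factors are pooled within similar cells
  # (assumed here; a common companion to computeSumFactors()).
  clusters <- scran::quickCluster(sce)
  sce <- scran::computeSumFactors(sce, clusters = clusters, min.mean = 0.1)

  # Log-transformed normalized expression values.
  sce <- scuttle::logNormCounts(sce)

  # Poisson-based technical trend, then rank genes by biological variance.
  dec <- scran::modelGeneVarByPoisson(sce)
  hvgs <- scran::getTopHVGs(dec, n = n_hvgs)

  list(sce = sce, dec = dec, hvgs = hvgs)
}
```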
Note: modelGeneVar() performs a similar function to modelGeneVarByPoisson(), but tends to understate biological variation in heterogeneous datasets, such as the whole unfractionated AGM that we analyze here.
10. Dimensionality reduction and visualization.
Note: In Sá da Bandeira et al., we explicitly set the number of principal components (PCs) to retain using the min.rank and max.rank parameters. These parameter settings were chosen based on an exploratory PCA, followed by analysis of the resulting scree plots (example in Figure 4). The scree plot illustrates the proportion of variance explained by each PC: the number of PCs to retain is often chosen by identifying an 'elbow' on the plot, beyond which retaining additional PCs explains little additional variance.
CRITICAL: The same number of PCs must be retained in all samples in order to integrate the samples downstream (step 13), since batchelor's correctExperiments() function combines the dimensionality reduction matrices for each sample, and these are required to be the same size. Consequently, we must choose a number of PCs that explains sufficient variation in all samples.
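The exploratory PCA, scree plot and fixed-rank PCA described in the two preceding paragraphs might be sketched as below. We assume that the min.rank and max.rank parameters refer to scran's denoisePCA(), which accepts both; the text does not name the function, and the helper names and the 50-PC exploratory range are our own illustrative choices.

```r
# Exploratory PCA: percentage of variance explained by each PC.
scree_data <- function(sce, hvgs, n_pcs = 50, seed = 100) {
  set.seed(seed)
  sce <- scater::runPCA(sce, subset_row = hvgs, ncomponents = n_pcs)
  pct <- attr(SingleCellExperiment::reducedDim(sce, "PCA"), "percentVar")
  data.frame(pc = seq_along(pct), percent_var = pct)
}

# Scree plot used to look for an 'elbow'.
plot_scree <- function(scree) {
  plot(scree$pc, scree$percent_var, type = "b",
       xlab = "PC", ylab = "Variance explained (%)")
}

# PCA retaining exactly `n_keep` PCs, by setting min.rank equal to max.rank.
# `dec` is the variance decomposition from modelGeneVarByPoisson().
fixed_rank_pca <- function(sce, dec, hvgs, n_keep, seed = 100) {
  set.seed(seed)
  scran::denoisePCA(sce, technical = dec, subset.row = hvgs,
                    min.rank = n_keep, max.rank = n_keep)
}
```

Calling fixed_rank_pca() with the same n_keep for the WT and KO samples yields PCA matrices with matching numbers of columns, as required for integration in step 13.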
b. Each sample may be visualized as t-SNE or UMAP plots via scater's runTSNE() and runUMAP() functions; both utilize the PCA reduction just computed.
Note: It is often useful to compare t-SNE reductions of different perplexities¹⁷ side by side. Rather than using runTSNE(), it may be advantageous to run the Rtsne() function underlying runTSNE() directly, specifying the desired perplexity and storing the resulting matrix in the reducedDims slot of the SingleCellExperiment object with an appropriate name (e.g., "TSNE30" rather than "TSNE").
11. Clustering.
a. Construct a shared nearest-neighbor (SNN) graph for each sample, using scran's buildSNNGraph() function.
b. Compute an initial clustering for each sample (Figure 5), by using the SNN graph as input to the Walktrap community finding algorithm, using the cluster_walktrap() function from the igraph R package.
Note: In Sá da Bandeira et al., we use the Walktrap algorithm, but the igraph package allows for the use of other community-finding algorithms, such as Louvain.¹ It may be worthwhile experimenting with different algorithms to explore the robustness of any cell clusters.
c. Apply doublet detection to each sample, for example via scDblFinder's computeDoubletDensity() function, which simulates random artificial doublets from real cells and tries to identify cells whose neighborhood has a high local density of artificial doublets.
Note: It is important to make sure that doublet detection is carried out on individual samples, prior to any merging or integrating processes; by definition, doublet cells can only arise from a single library. Clusters principally driven by a population of cells with high predicted doublet scores may be discarded; otherwise, we recommend keeping all cells, retaining their doublet score metadata for examination downstream if required.
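Step 11 can be outlined per sample as follows. This is a sketch, not the original code: the helper names and seed are ours, k = 10 is simply the buildSNNGraph() default, and the sample is assumed to carry a "PCA" entry in its reducedDims.

```r
# Walktrap clustering on a shared nearest-neighbor graph.
cluster_walktrap_snn <- function(sce, k = 10) {
  g <- scran::buildSNNGraph(sce, k = k, use.dimred = "PCA")
  factor(igraph::cluster_walktrap(g)$membership)
}

# Per-cell doublet scores from simulated artificial doublets.
# Doublets are simulated at random, so fix the seed.
score_doublets <- function(sce, hvgs, seed = 100) {
  set.seed(seed)
  scDblFinder::computeDoubletDensity(sce, subset.row = hvgs)
}

# Example usage on a single sample (run separately for WT and KO):
# sce_wt$cluster <- cluster_walktrap_snn(sce_wt)
# sce_wt$doublet_score <- score_doublets(sce_wt, hvgs)
# tapply(sce_wt$doublet_score, sce_wt$cluster, median)
```

The final tapply() call summarizes the doublet score per cluster, which helps to spot clusters that are principally driven by predicted doublets.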

12. Cell type annotation.
In Sá da Bandeira et al., we combined marker data computed on the clusters defined above (using scran's scoreMarkers() function) with literature-based markers for cell type annotation.¹ Cell type annotations were assigned based either on the clustering computed above or on the expression of known markers.
Note: Cell types present (especially rare cell types) may be found in different proportions, based on the specific samples sequenced. Cell types may also be combined or split based on the research question. During the annotation process, it is helpful to visualize the expression of literature-based markers, at both the cluster level and at a sample level using violin and t-SNE plots.
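The two annotation routes described above (by cluster, or by marker expression) can be illustrated on a toy table. Everything in this sketch is hypothetical: the cluster numbers, the cluster-to-cell-type mapping, the expression values and the threshold are invented for illustration and do not reproduce the published annotation. In practice, the columns would come from colData() and logcounts() of 'sce_wt'.

```r
# Toy per-cell table with a cluster label and the log-expression of one marker.
cells <- data.frame(
  cluster = c(1, 1, 2, 3, 3, 4),
  Pdgfrb  = c(2.1, 1.7, 0, 0, 1.5, 0)
)

# Route 1: assign cell types by cluster (hypothetical mapping).
cluster_to_type <- c("1" = "MSC", "2" = "EC", "3" = "Other", "4" = "EC")
cells$cell_type <- unname(cluster_to_type[as.character(cells$cluster)])

# Route 2: refine by marker expression. Here, unassigned cells expressing
# the marker above a hypothetical threshold are relabelled.
marker_positive <- cells$Pdgfrb > 1
cells$cell_type[cells$cell_type == "Other" & marker_positive] <- "MSC"

table(cells$cell_type)
```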
a. Merge the WT and KO samples using correctExperiments() from the batchelor Bioconductor package⁹; this function applies a batch correction while combining the assay data and column metadata for downstream analysis. correctExperiments() retains batch information in the batch slot and we update this to a more useful label.
b. Apply dimensionality reduction and clustering to the merged dataset, as described above (steps 10 and 11).
c. t-SNE visualization of the merged dataset in conjunction with the cell type annotations defined above (step 12) should confirm a clustering of cells by cell type, rather than genotype.
14. Differential expression analysis.
a. Use scran's pairwiseWilcox() function to perform differential expression analysis between groups of cells in the merged SingleCellExperiment object. The function requires a vector of group assignments for the 'groups' parameter, which is most easily specified as a combination of batch and cell type. The desired comparison can be made by specifying a vector of the relevant groups for the 'restrict' parameter.
b. The output from pairwiseWilcox() is a list of two elements: 'statistics' and 'pairs': 'statistics' (itself a list of DataFrames) contains the differential expression statistics, including the AUCs (the effect size), p-values and false discovery rate (FDR) values for each gene.
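The merge (step 13a) and the differential expression test (step 14) might be combined as in the sketch below. It assumes that both samples share the same genes and carry a per-cell annotation column, here given the hypothetical name 'cell_type'; the helper names and the "WT_MSC"-style group labels are also our own.

```r
# Merge WT and KO samples with batch correction.
merge_genotypes <- function(sce_wt, sce_ko) {
  merged <- batchelor::correctExperiments(sce_wt, sce_ko)
  # `batch` records the input order; relabel it by genotype.
  merged$batch <- factor(merged$batch, labels = c("WT", "KO"))
  merged
}

# Wilcoxon rank sum test of KO versus WT within one cell type.
wilcox_ko_vs_wt <- function(merged, cell_type = "MSC") {
  # Group label = genotype plus cell type, e.g., "WT_MSC".
  groups <- paste(merged$batch, merged$cell_type, sep = "_")
  wt <- paste("WT", cell_type, sep = "_")
  ko <- paste("KO", cell_type, sep = "_")

  res <- scran::pairwiseWilcox(merged, groups = groups, restrict = c(wt, ko))

  # 'pairs' lists each comparison; select KO (first) versus WT (second).
  idx <- which(res$pairs$first == ko & res$pairs$second == wt)
  stats <- res$statistics[[idx]]
  stats[order(stats$FDR), ]
}
```

In the returned table, an AUC above 0.5 indicates higher expression in the first group (KO) and an AUC below 0.5 indicates higher expression in the second group (WT).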
Note: Other statistical tests may be used (e.g., t-tests via the pairwiseTTests() function). Here, we used Wilcoxon rank sum tests since they are considered more robust to outliers and insensitive to non-normality in comparison to t-tests. We note, however, that the disadvantages of
c. Manually curate the significantly overrepresented GO terms for terms relating to osteogenesis.
d. Search for GO terms of interest in the AmiGO web resource (http://amigo.geneontology.org/amigo), which lists genes associated with a given GO term.
16. Download the associated genes for any GO terms of interest and cross-reference these with the DEGs to find the genes which contributed most strongly to these GO terms (either by significance, or by the AUC (effect size) computed by the Wilcoxon rank sum test) (Figure 7).
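The cross-referencing in step 16 amounts to intersecting two gene lists and ranking the overlap. The base R sketch below uses a toy DEG table with placeholder gene names and invented statistics; the FDR cut-off of 0.05 is likewise an illustrative assumption.

```r
# Toy DEG table standing in for a pairwiseWilcox() result (values invented).
degs <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  AUC  = c(0.82, 0.35, 0.64, 0.21),
  FDR  = c(0.001, 0.004, 0.20, 0.0002)
)

# Genes annotated to a GO term of interest, e.g., exported from AmiGO
# (placeholder names).
go_genes <- c("geneB", "geneD", "geneE")

# Keep significant DEGs that are annotated to the GO term.
hits <- degs[degs$gene %in% go_genes & degs$FDR < 0.05, ]

# An AUC of 0.5 means no difference, so rank by distance from 0.5.
hits <- hits[order(abs(hits$AUC - 0.5), decreasing = TRUE), ]
hits
```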

Mesenchymal stem/stromal cell (MSC) derivation and culture
Timing: days to weeks
We developed a mesenchymal cell culture to investigate whether the osteogenic developmental potential of PDGFRβ−/− AGM-derived MSCs is impaired, as suggested by our scRNA-seq data analysis in vivo. We first tested whether mesenchymal stem/stromal cell lines can be derived from single WT AGMs, then whether MSCs can be derived in the absence of PDGFRβ. The specific steps are as follows:
17. Prepare MSC medium.
Note: This medium has been previously described⁴ and can be stored at 4°C for up to 1 month.

Seeding of AGM cells.
a. Pre-coat a 6-well plate with 2 mL of sterile 0.1% cold gelatin for at least 1 h at room temperature (20°C–22°C) or 10 min at 4°C.
b. Gently aspirate the remaining gelatin and proceed without washing.

a. In the first week, refresh the stromal medium only once, approx. 2–3 days after the cells were first seeded.
b. When the wells are >90% confluent (approx. 4–6 days), passage the cells from one well of a 6-well plate to one gelatin pre-coated T25 flask (= passage 1) using pre-warmed Trypsin + EDTA (0.25%) to detach the adherent cells.
c. From here on, refresh the medium bi-weekly.
flask to one gelatin pre-coated T75 flask (= passage 2) using 0.25% Trypsin + EDTA to detach the adherent cells for approx. 12–15 min.
e. When the T75 flasks are >90% confluent (approx. 1 week), expand the culture at a 1:3 ratio (from one T75 to three T75 flasks and so on) for the following weeks and passages.
Note: When we initiate the culture with freshly harvested cells (passage 0), some MSC primary lines require 2–3 extra days to expand. This is a relatively short time and does not influence their growth or their differentiation potential at later stages. We were able to expand these cells and freeze/thaw them regularly for about 12 passages, independently of their genotype. However, their osteogenic potential was only tested between passage 3 and passage 6, while their hematopoietic support was tested up to passage 12.¹
CRITICAL: Cells need to be kept in the incubator at 37°C and 5% CO₂ at all times.
f. After expansion, a fraction can be used for experiments and the remaining cells can be frozen and stored in liquid N₂ or passaged.

of PBS/FCS/PS. Keep the cryovial in the water bath at 37°C for a few seconds until the frozen block starts to detach, then rapidly transfer the frozen block to the 50 mL Falcon tube in the tissue culture hood. Centrifuge the cells at 4°C and 2,000 rpm for 5 min. Discard the supernatant, resuspend the pelleted cells in MSC medium and transfer them to a gelatin pre-coated T75 flask, then incubate them at 37°C. After 24 h, renew the medium to discard the floating cells. Proceed as normal.
g. Both WT and KO stromal cells should show a fibroblast-like morphology and resemble MSCs (Figure 8).
Note: A quantitative measurement can be made on these samples by reading the plate on a spectrophotometer at a wavelength of 450 nm.

EXPECTED OUTCOMES
The calcium-containing cells in the differentiated culture will be stained red and can vary from light to dark red (Figure 9). Most PDGFRβ-KO stromal cell lines do not differentiate into calcium-containing cells.

LIMITATIONS
Dissecting embryos can be challenging, and thus, this step needs to be performed by a trained scientist.

Problem 1
The gel doesn't show any bands (step 4).

Potential solution
The absence of bands may be due to a poor DNA extraction. This can be fixed by repeating the DNA extraction step followed by a new PCR and gel. Missing the polymerase from the PCR master mix could be another reason that can be fixed by repeating the PCR step.

Problem 2
Running the same R/Bioconductor function on the same data returns different results (step 7).

Potential solution
Many analysis functions use random processes, e.g., emptyDrops() performs Monte Carlo simulations to compute p-values, so setting a random seed via set.seed() is required to obtain reproducible results. It is also worthwhile testing several random seeds to ensure that results are robust.
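The effect of set.seed() can be demonstrated with a small base R example. Here a simple random simulation stands in for a stochastic analysis function such as emptyDrops(); the function name and seed values are arbitrary.

```r
# A stand-in for any analysis step that uses random numbers.
simulate_once <- function(seed) {
  set.seed(seed)
  mean(rnorm(100))
}

# The same seed reproduces the same result exactly.
identical(simulate_once(42), simulate_once(42))  # TRUE

# Different seeds give different results; comparing them indicates
# how sensitive a conclusion is to the random component.
seeds <- c(1, 42, 2022)
results <- vapply(seeds, simulate_once, numeric(1))
results
```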

Problem 3
Clustering my scRNA-seq data yields too many/not enough cell clusters (step 11).

Potential solution
The choice of the connectivity parameter 'k' in buildSNNGraph() effectively controls the number of output clusters: smaller values result in more, finer clusters, and larger values result in fewer, more general clusters.
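One way to explore this is to recluster a sample over a range of 'k' values and compare the number of clusters obtained. The sketch below assumes a SingleCellExperiment with a "PCA" entry in its reducedDims; the helper name and the candidate values of k are illustrative.

```r
# Number of Walktrap clusters obtained for each candidate value of k.
clusters_by_k <- function(sce, k_values = c(5, 10, 20, 40)) {
  n_clusters <- vapply(k_values, function(k) {
    g <- scran::buildSNNGraph(sce, k = k, use.dimred = "PCA")
    length(unique(igraph::cluster_walktrap(g)$membership))
  }, integer(1))
  setNames(n_clusters, paste0("k=", k_values))
}

# Example usage:
# clusters_by_k(sce_wt)
```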

Problem 4
I am struggling to annotate cell types based on the computed cluster markers (step 12).

Potential solution
Automated cell type annotation packages, e.g., SingleR,²⁰ may be used to predict cell types based on reference (e.g., FACS sorted or pre-annotated single-cell) datasets. However, the cell type prediction relies on the quality of annotations and/or cell types included in the reference dataset. These methods may also struggle with cell types that express similar markers (e.g., ECs, HECs, IAHCs in this dataset).

Problem 5
At the end of the alizarin staining, some differentiated (calcium-containing) red cells have detached and are now floating (step 22).

Potential solution
Indeed, the differentiated cells detach easily and may be lost during the staining. To avoid this, do not keep the cells in differentiation medium for longer than 21 days, as they are not expected to differentiate further. To avoid losing stained cells, each liquid used (Milli-Q water, alizarin red, and PFA) should be added or aspirated using a P200 pipette with the tip kept against the wall of the well. This careful handling prevents any direct pressure on, or contact with, the cells.

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Mihaela Crisan (Mihaela.crisan@ed.ac.uk).

Materials availability
This study did not generate new unique reagents.

Data and code availability
Original scRNA-seq data is available at NCBI GEO accession number GSE162103.